Improved Tonal Language Speech Recognition by Integrating Spectro-Temporal Evidence and Pitch Information with Properly Chosen Tonal Acoustic Units
Abstract
We propose an improved Tandem system for tonal language speech recognition. Three different types of features, cepstral, spectro-temporal and pitch features, are integrated for modeling tone and phoneme variation simultaneously. Tonal phonemes (or tonemes) are used for MLP posterior estimation, and tonal acoustic units for HMM recognition. In our experiments conducted on Mandarin broadcast news, a 19.3% relative CER reduction was achieved over the conventional MFCC Tandem baseline. With different training acoustic units, we analyze the complementarity among the three types of features in tone, phoneme, and toneme classification.
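As a rough illustration of the Tandem idea summarized above, the following minimal sketch stacks three feature streams, passes them through a classifier that outputs toneme posteriors, and appends the decorrelated log-posteriors to the cepstral features for HMM modeling. Everything specific here is an assumption made for illustration and is not taken from the paper: the feature dimensions, the number of toneme classes, the single random-weight layer standing in for a trained MLP, the use of one joint classifier for all three streams, and the function name tandem_features. In a real system the PCA transform would also be estimated on training data rather than on the utterance itself.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the class axis.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def tandem_features(cepstral, spectro_temporal, pitch, w, b, n_keep):
    """Hypothetical helper: build Tandem-style features for an HMM back end."""
    # Stack the three feature streams frame by frame as classifier input.
    mlp_in = np.hstack([cepstral, spectro_temporal, pitch])
    # Single-layer stand-in for a trained MLP: posteriors over toneme classes.
    posteriors = softmax(mlp_in @ w + b)
    # Taking logs makes the skewed posterior distribution more Gaussian-like.
    log_post = np.log(posteriors + 1e-10)
    # PCA decorrelates the log-posteriors and reduces their dimension.
    centered = log_post - log_post.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(centered, rowvar=False))
    top = eigvec[:, np.argsort(eigval)[::-1][:n_keep]]
    # Append the reduced posteriors to the cepstral stream.
    return np.hstack([cepstral, centered @ top])

# Illustrative sizes only (assumed, not from the paper).
rng = np.random.default_rng(0)
n_frames, n_cep, n_st, n_pitch, n_tonemes, n_keep = 200, 39, 20, 3, 50, 25
cep = rng.normal(size=(n_frames, n_cep))
st = rng.normal(size=(n_frames, n_st))
f0 = rng.normal(size=(n_frames, n_pitch))
w = rng.normal(size=(n_cep + n_st + n_pitch, n_tonemes))
b = np.zeros(n_tonemes)

feats = tandem_features(cep, st, f0, w, b, n_keep)
```

Each row of feats is one frame: the original cepstral coefficients followed by the reduced posterior features, which is the form a Gaussian-mixture HMM would typically consume in a Tandem setup.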
Related resources
Tonal Languages and Cochlear Implants
As a major part of world languages, tonal languages are spoken in every continent except for Australia. In a tonal language, voice pitch variation (i.e., tone) at the monosyllabic level is a segmental structure that conveys lexical meaning of a word (Duanmu 2000). Mandarin Chinese, a tonal language, is spoken by more people than any other single language, including non-tonal languages. While so...
Experiments on Chinese Speech Recog Pitch Estimation Using the M
Automatic speech recognition of a tonal and syllabic language such as Chinese Mandarin poses new challenges but also offers new opportunities. We present approaches and experimental results concerning the choice of base units for acoustic modeling, pitch estimation and how to integrate pitch estimates into the modeling framework. The experimental evaluations are carried out both on rather clean...
On Integrating Tonal Information into Chinese Speech Recognition
Reliable pitch detection is important in Chinese speech recognition since Chinese is a tonal language. In this paper, several pitch information integration approaches are investigated. In a noise-free environment, conventional pitch estimators work quite well. In adverse conditions, however, robustness of pitch detection algorithms is still a challenging problem. Our experimental results show t...
The Critical Elements for Cantonese Speech Recognition – Implications for the Design of Digital Hearing Aid Amplification Algo
Hearing aid amplification remains the only effective intervention method for people suffering from sensory hearing loss of cochlear origin. The design and validation of hearing aid amplification algorithms are based on the speech acoustics of non-tonal languages such as English. Chinese tonal languages (such as Cantonese and Mandarin) are fundamentally different from non-tonal languages that...
F0 Contour Analysis Based on Empirical Mode Decomposition for DNN Acoustic Modeling in Mandarin Speech Recognition
Tone information provides a strong distinction for many ambiguous characters in Mandarin Chinese. The use of tonal acoustic units and F0 related tonal features have been shown to be effective at improving the accuracy of Mandarin automatic speech recognition (ASR) systems, as F0 contains the most prominent tonal information for distinguishing words that are phonemically identical. Both long-ter...